Discriminative learning from partially annotated examples

Author

  • Kostiantyn Antoniuk
Abstract

The number of algorithms for automatically learning classifiers from examples, and the number of their applications, is ever growing. Most existing algorithms require a training set of completely annotated examples, which are often hard to obtain. In this thesis, we tackle the problem of learning from partially annotated examples, in which each training input comes with a set of admissible labels, only one of which is correct. We contributed to two different cases of this scenario. In the first case, we studied the problem of learning ordinal classifiers from examples with interval annotation of labels. We designed a convex learning algorithm for this case and demonstrated its advantage empirically on real data. At the same time, we made several contributions to supervised learning of ordinal classifiers: we proposed a new parametrization of the ordinal classifier, introduced a more flexible piecewise version of the ordinal classifier, and proposed a generic cutting plane solver with convergence guarantees. In the second case, we studied the problem of learning structured output classifiers from examples in which the annotation of a subset of labels is missing. We defined the concept of a classification calibrated surrogate partial loss, the minimization of which guarantees that learning is statistically consistent under fairly general conditions on the data generating process. We proved the existence of a convex classification calibrated surrogate loss for learning from partially annotated examples, and we showed which existing surrogate losses are classification calibrated and which are not. Our work thus provides the missing theoretical justification for methods that have so far been heuristic, yet have been used successfully in practice.
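To make the interval-annotation setting concrete, the following is a minimal sketch and not the thesis's actual algorithm. It assumes a standard threshold-based ordinal classifier (the label is one plus the number of sorted thresholds that a real-valued score reaches) and an interval-insensitive loss built on absolute error, which is zero when the prediction falls inside the annotated interval and otherwise equals the distance to the nearest interval endpoint. The function names `predict_ordinal` and `interval_insensitive_loss` are hypothetical, and the choice of absolute error as the base loss is an assumption that the abstract does not state.

```python
from bisect import bisect_right


def predict_ordinal(score, thresholds):
    """Threshold-based ordinal prediction (assumed model, for illustration).

    `thresholds` must be sorted in ascending order. The returned label lies
    in 1..len(thresholds)+1 and equals 1 plus the number of thresholds that
    are less than or equal to `score`.
    """
    return 1 + bisect_right(thresholds, score)


def interval_insensitive_loss(pred, lo, hi):
    """Interval-insensitive loss with absolute error as the base loss.

    The annotation is the interval of admissible labels [lo, hi]. The loss is
    zero inside the interval; outside, it is the distance to the closer
    endpoint.
    """
    if pred < lo:
        return lo - pred
    if pred > hi:
        return pred - hi
    return 0


# Hypothetical 4-class problem defined by three sorted thresholds.
thresholds = [-1.0, 0.0, 1.5]
label = predict_ordinal(0.3, thresholds)          # two thresholds are <= 0.3
loss = interval_insensitive_loss(label, 1, 2)     # annotation says label is 1 or 2
```

With a fully annotated example the interval collapses to a single label (`lo == hi`), and this loss reduces to the ordinary absolute error, which is why supervised ordinal learning can be viewed as a special case of the interval setting.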

Similar resources

Interval Insensitive Loss for Ordinal Classification

We address the problem of learning an ordinal classifier from partially annotated examples. We introduce an interval-insensitive loss function to measure the discrepancy between the predictions of an ordinal classifier and a partial annotation provided in the form of intervals of admissible labels. The proposed interval-insensitive loss is an instance of loss functions previously used for learning of differ...

Weakly supervised discriminative localization and classification: a joint learning process

Visual categorization problems, such as object classification or action recognition, are increasingly often approached using a detection strategy: a classifier function is first applied to candidate subwindows of the image or the video, and then the maximum classifier score is used for class decision. Traditionally, the subwindow classifiers are trained on a large collection of examples manuall...

Name Tagging with Word Clusters and Discriminative Training

We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is encoded in features that are incorporated in a discriminatively trained tagging model. Active learning is used to select training examples. We evaluate the technique for named-entity tagging. Compared with a state-of-...

Learning discriminative localization from weakly labeled data

Visual categorization problems, such as object classification or action recognition, are increasingly often approached using a detection strategy: a classifier function is first applied to candidate subwindows of the image or the video, and then the maximum classifier score is used for class decision. Traditionally, the subwindow classifiers are trained on a large collection of examples manuall...

Learning Transferable Representation for Bilingual Relation Extraction via Convolutional Neural Networks

Typically, relation extraction models are trained to extract instances of a relation ontology using only training data from a single language. However, the concepts represented by the relation ontology (e.g. ResidesIn, EmployeeOf) are language independent. The numbers of annotated examples available for a given ontology vary between languages. For example, there are far fewer annotated examples...

Publication date: 2016